Data preparation

Setting up a session

# Loaded packages with their versions
sessionInfo() # to check your locale
## R version 4.3.1 (2023-06-16)
## Platform: aarch64-apple-darwin20 (64-bit)
## Running under: macOS Sonoma 14.5
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRblas.0.dylib 
## LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## time zone: Europe/Kiev
## tzcode source: internal
## 
## attached base packages:
## [1] grid      stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] gridExtra_2.3     sf_1.0-15         easyclimate_0.2.1 lubridate_1.9.3  
##  [5] forcats_1.0.0     stringr_1.5.1     dplyr_1.1.4       purrr_1.0.2      
##  [9] readr_2.1.5       tidyr_1.3.1       tibble_3.2.1      ggplot2_3.5.0    
## [13] tidyverse_2.0.0  
## 
## loaded via a namespace (and not attached):
##  [1] sass_0.4.9         utf8_1.2.4         generics_0.1.3     class_7.3-22      
##  [5] KernSmooth_2.23-21 stringi_1.8.3      hms_1.1.3          digest_0.6.35     
##  [9] magrittr_2.0.3     evaluate_0.23      timechange_0.2.0   fastmap_1.1.1     
## [13] jsonlite_1.8.8     e1071_1.7-14       DBI_1.2.2          fansi_1.0.6       
## [17] scales_1.3.0       jquerylib_0.1.4    cli_3.6.2          rlang_1.1.3       
## [21] units_0.8-5        munsell_0.5.1      withr_3.0.0        cachem_1.0.8      
## [25] yaml_2.3.8         tools_4.3.1        tzdb_0.4.0         colorspace_2.1-0  
## [29] vctrs_0.6.5        R6_2.5.1           proxy_0.4-27       lifecycle_1.0.4   
## [33] classInt_0.4-10    pkgconfig_2.0.3    pillar_1.9.0       bslib_0.7.0       
## [37] gtable_0.3.4       glue_1.7.0         Rcpp_1.0.12        xfun_0.43         
## [41] tidyselect_1.2.1   rstudioapi_0.16.0  knitr_1.45         htmltools_0.5.8.1 
## [45] rmarkdown_2.26     compiler_4.3.1

Since we have nights (“days”) with very few bats captured, we decided to drop the nights with lesser bats captured than the threshold.

Exploratory analysis

Check how many bats we captured during the night, per territory.

Growth metrics distribution

For each bat captured, we measured two growth-related numerical variables - forearm length (Ra, mm), and weight (W, g).

What the statistical distribution of response variables? Distribution in both response variables diviate from normality.

How Ra related to W?

Bats with bigger forearm (Ra) have higher mass (W), and vice versa. Atthe same time, there is a visible differences among years.

Sex dependency

Is there any sex differences in growth metrics? Is that possible that male and female bats growth different?

## 
##  Shapiro-Wilk normality test
## 
## data:  df$Ra
## W = 0.99056, p-value = 0.0001243

## 
##  Bartlett test of homogeneity of variances
## 
## data:  Ra by sex
## Bartlett's K-squared = 0.21128, df = 1, p-value = 0.6458
## 
##  Shapiro-Wilk normality test
## 
## data:  df$W
## W = 0.98967, p-value = 5.125e-05

## 
##  Bartlett test of homogeneity of variances
## 
## data:  W by sex
## Bartlett's K-squared = 0.82962, df = 1, p-value = 0.3624

To select a method for assessing sex dependent variance, we first checked homogeneity of variances and normality. Bartlett test indicated good homogeneity for both Ra and W, but Shapiro-Wilk test indicated deviation from normality. On the other hand, visual exploration suggest highly symmetrical distribution. Thus, we tested differences between male and female bats using t-test:

Ra appeared to be sex-dependent, Weight did not show such pattern.

To reveal possible interactions with sex, we then took a look at the sex-dependent variation of Ra/W across the years of sampling.

In our data set, male/female ration in Ra seemed to be constant across the years of survey - male young bats had always smaller arm than female ones. Contrary, W did not show a constant pattern across the year.

Therefore, to test our hypothesis about spring weather effect on bat growth, we must fit models that include sex as either nominative explanatory variables with interactions or random effect.

Changes over the inventory time

Bat survey (inventory) lasted approximately two weeks in July, after all of the juveniles turn to be able to fly. Therefore, we can expect some directed changes in bat body measures over this period. We tested the relations between growth metrics (Ra and W) and day-of-the-year, controlling different years.

Ra did not seen to change over the survey period, whereas W constantly increased. If we compare study years, slopes of simple linear models slightly varies, as well as intercepts, but remains close to the horizontal (Ra seems to vary indifferently either year or capture date). By contrast, W shows constant slopes over the years, but differs in intercepts. In other words,

  • W does changes over the single survey period,
  • W steadily increased,
  • there are quantitative differences in average bat W across the years, but the way W changed over the survey period.

We also check if some sex-dependent differences in W exist in each year.

To numerically check different growth metrics’ variation during inventory time, we fit a simple linear models per years.

## [[1]]
##      Coefficient    p_value sign_level
## 2008  0.03251899 0.16228879           
## 2009  0.03996433 0.09580011           
## 2011 -0.01839296 0.42644586           
## 2014  0.13586352 0.02530808          *
## 2019  0.07414264 0.01202252          *
## 
## [[2]]

Ra is weakly related to the time of capture.

## [[1]]
##      Coefficient      p_value sign_level
## 2008   0.3838443 6.407741e-16          *
## 2009   0.3115096 6.578883e-12          *
## 2011   0.3305665 1.171298e-12          *
## 2014   0.3179832 5.773326e-03          *
## 2019   0.2251565 3.862218e-06          *
## 
## [[2]]

W strongly related to the time of capture.

As we can see, Ra generally remains constant over survey period, and probably doesn’t differ among years. Weight, instead, (i) differs among years, and (ii) grows over the survey period.

In other words, bats captured later in a year might have higher weight, regardless of spring weather conditions. To minimize an effect of time within inventory period, we calculated a synthetic, capture-time-independent growth metrics called Body Mass Index (BMI), and test its variation over the inventory period and between the years. For each bat capture event, BMI was calculated as the residuals from a linear regression of the logarithmically transformed weight (W) against the logarithm of the day-of-the-year (day).

Log-transformation scales BMI from -1 to 1, which facilitates its interpretation. Inspired by Mikryukov et al. 2023, and, partly, Scaled mass index - SMI by Peig and Green 2009

Check BMI’s sensitivity to day-of-the-year:

As we can see on the Figure above, BMI isn’t sensitive to bat capture date within a survey period, but reflects differences among years.

## [[1]]
##        Coefficient   p_value sign_level
## 2008  0.0027457969 0.1305312           
## 2009 -0.0007378799 0.6700145           
## 2011  0.0009869243 0.6133391           
## 2014  0.0012891446 0.8093400           
## 2019 -0.0025366965 0.2552311           
## 
## [[2]]
## `geom_smooth()` using formula = 'y ~ x'

Similarly, BMI shows no statistically significant dependency on capture date (day-of-the-year), even when we treat it for each year separately.

Therefore, we choose to test two alternative response variables - forearm length (Ra) and Body Mass Index (BMI), as a possible growth metric in our research.

Generate weather data

# Take a look at the timespan
summary(df$date)
##         Min.      1st Qu.       Median         Mean      3rd Qu.         Max. 
## "2008-07-02" "2009-07-08" "2011-07-05" "2011-07-23" "2011-07-17" "2019-07-18"

Monthly climate

Here we explore the relationship among bat forearm length, weight, and Body Mass Index and simplest possible aggregated climatic variables - average mean temperature, average minimal temperature, and cumulative precipitation. Time period for aggregation - one calendar month.

daily %>% 
  mutate(month = lubridate::month(date)) %>% 
  mutate(Year = as.factor(lubridate::year(date))) %>% 
  group_by(place_id, Year, month) %>% 
  summarise(sum(Prcp), mean(Tmean), mean(Tmin)) %>% 
  rename(Prcp = 4, Tmean = 5, Tmin = 6) -> aggregated_clim
## `summarise()` has grouped output by 'place_id', 'Year'. You can override using
## the `.groups` argument.
df %>% 
  select(Territory, place_id, Year, Ra, W, BMI) %>% 
  left_join(aggregated_clim) -> monthly_clim
## Joining with `by = join_by(place_id, Year)`
rm(aggregated_clim)

Ra

# Ra
monthly_clim %>% 
  filter(month %in% c(1:6)) %>%
  ggplot(aes(x = Tmean, y = Ra)) +
  geom_point(shape = 1) +
  # geom_smooth(method = "lm") +
  # geom_smooth() +
  stat_smooth(method = glm) +
  facet_wrap(vars(month), scales = "free")
## `geom_smooth()` using formula = 'y ~ x'

monthly_clim %>% 
  filter(month %in% c(1:6)) %>%
  ggplot(aes(x = Tmin, y = Ra)) +
  geom_point(shape = 1) +
  # geom_smooth(method = "lm") +
  # geom_smooth() +
  stat_smooth(method = glm) +
  facet_wrap(vars(month), scales = "free")
## `geom_smooth()` using formula = 'y ~ x'

monthly_clim %>% 
  filter(month %in% c(1:6)) %>%
  ggplot(aes(x = Prcp, y = Ra)) +
  geom_point(shape = 1) +
  # geom_smooth(method = "lm") +
  # geom_smooth() +
  stat_smooth(method = glm) +
  facet_wrap(vars(month), scales = "free")
## `geom_smooth()` using formula = 'y ~ x'

W

# W
monthly_clim %>% 
  filter(month %in% c(1:6)) %>%
  ggplot(aes(x = Tmean, y = W)) +
  geom_point(shape = 1) +
  geom_smooth(method = "lm") +
  facet_wrap(vars(month), scales = "free")
## `geom_smooth()` using formula = 'y ~ x'

monthly_clim %>% 
  filter(month %in% c(1:6)) %>%
  ggplot(aes(x = Tmin, y = W)) +
  geom_point(shape = 1) +
  geom_smooth(method = "lm") +
  facet_wrap(vars(month), scales = "free")
## `geom_smooth()` using formula = 'y ~ x'

monthly_clim %>% 
  filter(month %in% c(1:6)) %>%
  ggplot(aes(x = Prcp, y = W)) +
  geom_point(shape = 1) +
  geom_smooth(method = "lm") +
  facet_wrap(vars(month), scales = "free")
## `geom_smooth()` using formula = 'y ~ x'

BMI

# BMI
monthly_clim %>% 
  filter(month %in% c(1:6)) %>%
  ggplot(aes(x = Tmean, y = BMI)) +
  geom_point(shape = 1) +
  geom_smooth(method = "lm") +
  facet_wrap(vars(month), scales = "free")
## `geom_smooth()` using formula = 'y ~ x'

monthly_clim %>% 
  filter(month %in% c(1:6)) %>%
  ggplot(aes(x = Tmin, y = BMI)) +
  geom_point(shape = 1) +
  geom_smooth(method = "lm") +
  facet_wrap(vars(month), scales = "free")
## `geom_smooth()` using formula = 'y ~ x'

monthly_clim %>% 
  filter(month %in% c(1:6)) %>%
  ggplot(aes(x = Prcp, y = BMI)) +
  geom_point(shape = 1) +
  geom_smooth(method = "lm") +
  facet_wrap(vars(month), scales = "free")
## `geom_smooth()` using formula = 'y ~ x'

Climwin

# Vignettes
# library(climwin)
# 
# vignette("climwin", package = "climwin")
# vignette("advanced_climwin", package = "climwin")

# Load necessary libraries
library(climwin)
library(lme4)

# Set priors

cdate <- daily$date  # date var in climate data
bdate <- df$date     # date var in biological data

# Relative or absolute time response.
# If "absolute", reference day as day, month is needed
type <- "absolute"
refday <- c(01, 07)

# If "relative", time window calculated for each individual value
# type <- "relative"
# refday <- NULL

# Temporal resolution of climate data
# cinterval <- "day"
cinterval <- "week"

# Upper and lower limit for tested climate windows respectively
# Must correspond to the resolution chosen in ‘cinterval’
# if 'cinterval' set to "days"
# range <- c(250, 0)

# if 'cinterval' set to "week"
# range <- c(30, 0) # (~ 1 Dec)
range <- c(23, 0) # ~ 15 Jan

stat <- "mean"

func <- "lin"

# number of repeats for randomization
# repeats <- 5 # for metric = "C"
repeats <- 200 # for metric = "AIC"

# Metrics for assessing
# metric = "C"
metric = "AIC"

# Number of years in data
sample.size = nlevels(as.factor(as.vector(df$Year)))

Temperature

Here we explore two temperature predictors: Mean (Tmean) and Minimal (Tmin) daily temperatures, averaged across time window.

# Set priors
# List of dependent variable candidates
xvar <- list(Tmean = daily$Tmean, Tmin = daily$Tmin)

Ra

# Null model with no climate signal
baseline <- lm(Ra ~ 1, data = df)

Candidate model fitting

# Fit candidate model set
# RaWin <- slidingwin(xvar = xvar,
#                     cdate = daily$date,
#                     bdate = df$date,
#                     baseline = baseline,
#                     cinterval = cinterval,
#                     range = range,
#                     type = type, refday = refday,
#                     stat = stat,
#                     func = func,
#                     spatial = list(df$place_id, daily$place_id))
# 
# save(RaWin, file = paste0(modeldir, "RaWin.Rdata"))
# rm(RaWin)
# gc()
# 
# Fit randomized model set for evaluation purposes
# RaRand <- randwin(repeats = repeats,
#                   xvar = xvar,
#                   cdate = cdate,
#                   bdate = bdate,
#                   baseline = baseline,
#                   cinterval = cinterval,
#                   range = range,
#                   type = type, refday = refday,
#                   stat = stat,
#                   func = func,
#                   spatial = list(df$place_id, daily$place_id))
# 
# save(RaRand, file = paste0(modeldir, "RaRand.Rdata"))
# rm(RaRand)
# gc()

Model diagnostics

load(file = paste0(modeldir, "RaWin.Rdata"))
# Possible combinations of model parameters
RaWin$combos
##   response climate     type stat func DeltaAICc WindowOpen WindowClose
## 1       Ra   Tmean absolute mean  lin     -8.75          7           5
## 2       Ra    Tmin absolute mean  lin     -8.27         21          21
candidate_models <- list()

for (i in 1:(length(RaWin)-1)) {
  candidate_models[[i]] <- head(RaWin[[i]]$Dataset)
}

candidate_models
## [[1]]
##     deltaAICc WindowOpen WindowClose   ModelBeta  Std.Error ModelBetaQ
## 31  -8.754531          7           5 -0.11783871 0.03582217         NA
## 60  -8.176073         10           6 -0.18569572 0.05804150         NA
## 23  -8.133348          6           5 -0.07812802 0.02447159         NA
## 232 -8.068939         21          21  0.03628371 0.01140142         NA
## 73  -8.025189         11           5 -0.17088155 0.05381371         NA
## 61  -7.833059         10           5 -0.14672686 0.04665852         NA
##     ModelBetaC ModelInt Function Furthest Closest Statistics     Type K
## 31          NA 56.02267      lin       23       0       mean absolute 0
## 60          NA 56.60265      lin       23       0       mean absolute 0
## 23          NA 55.43276      lin       23       0       mean absolute 0
## 232         NA 54.26605      lin       23       0       mean absolute 0
## 73          NA 56.34719      lin       23       0       mean absolute 0
## 61          NA 56.16855      lin       23       0       mean absolute 0
##      ModWeight sample.size Reference.day Reference.month Randomised
## 31  0.05504871          80             1               7         no
## 60  0.04122271          80             1               7         no
## 23  0.04035143          80             1               7         no
## 232 0.03907264          80             1               7         no
## 73  0.03822721          80             1               7         no
## 61  0.03472579          80             1               7         no
## 
## [[2]]
##     deltaAICc WindowOpen WindowClose   ModelBeta   Std.Error ModelBetaQ
## 232 -8.271720         21          21  0.03133884 0.009749385         NA
## 31  -8.040529          7           5 -0.14073998 0.044287529         NA
## 23  -7.885157          6           5 -0.09745162 0.030907018         NA
## 25  -7.876671          6           3 -0.12105236 0.038408621         NA
## 63  -7.657813         10           3 -0.15367592 0.049311878         NA
## 33  -7.630226          7           3 -0.14270778 0.045858260         NA
##     ModelBetaC ModelInt Function Furthest Closest Statistics     Type K
## 232         NA 54.31994      lin       23       0       mean absolute 0
## 31          NA 55.63115      lin       23       0       mean absolute 0
## 23          NA 55.23145      lin       23       0       mean absolute 0
## 25          NA 55.64902      lin       23       0       mean absolute 0
## 63          NA 55.61541      lin       23       0       mean absolute 0
## 33          NA 55.82954      lin       23       0       mean absolute 0
##      ModWeight sample.size Reference.day Reference.month Randomised
## 232 0.05020483          80             1               7         no
## 31  0.04472424          80             1               7         no
## 23  0.04138132          80             1               7         no
## 25  0.04120611          80             1               7         no
## 63  0.03693491          80             1               7         no
## 33  0.03642896          80             1               7         no
load(file = paste0(modeldir, "RaRand.Rdata"))
randomized_models <- list()

for (i in 1:(length(RaRand)-1)) {
  randomized_models[[i]] <- head(RaRand[[i]])
}

randomized_models
## [[1]]
##       deltaAICc WindowOpen WindowClose  ModelBeta Std.Error ModelBetaQ
## 277  -0.9112374         23          23 -0.9527015 0.5569915         NA
## 278   0.6014854         23          22  0.7529007 0.6334868         NA
## 279   0.2897328         23          21  0.8301107 0.6321991         NA
## 137   0.4139612         16          16 -0.6475232 0.5119223         NA
## 155   1.5869429         17          16 -0.1926266 0.2942540         NA
## 2771  1.8470358         23          23 -0.2294427 0.5580433         NA
##      ModelBetaC ModelInt Function Furthest Closest Statistics     Type K
## 277          NA 54.20434      lin       23       0       mean absolute 0
## 278          NA 55.02108      lin       23       0       mean absolute 0
## 279          NA 55.55371      lin       23       0       mean absolute 0
## 137          NA 57.81659      lin       23       0       mean absolute 0
## 155          NA 55.07879      lin       23       0       mean absolute 0
## 2771         NA 54.12201      lin       23       0       mean absolute 0
##        ModWeight sample.size Reference.day Reference.month Randomised Repeat
## 277  0.011283030          80             1               7        yes      1
## 278  0.005492493          80             1               7        yes      2
## 279  0.006858946          80             1               7        yes      3
## 137  0.006183120          80             1               7        yes      4
## 155  0.003753598          80             1               7        yes      5
## 2771 0.003604939          80             1               7        yes      6
##      WeightDist
## 277   0.9366667
## 278   0.9400000
## 279   0.9400000
## 137   0.9433333
## 155   0.9433333
## 2771  0.9466667
## 
## [[2]]
##      deltaAICc WindowOpen WindowClose  ModelBeta Std.Error ModelBetaQ
## 122 -2.3366078         15          14  0.7166091 0.3434229         NA
## 7    0.6468023          3           3  0.6859561 0.5866389         NA
## 30  -1.8099242          7           6  1.2258196 0.6266928         NA
## 173  0.3181124         18          17  0.7928865 0.6088798         NA
## 254 -1.7942770         22          22 -1.1386158 0.5833074         NA
## 211 -2.7111716         20          20  0.4717214 0.2168967         NA
##     ModelBetaC ModelInt Function Furthest Closest Statistics     Type K
## 122         NA 52.39326      lin       23       0       mean absolute 0
## 7           NA 44.09754      lin       23       0       mean absolute 0
## 30          NA 40.10981      lin       23       0       mean absolute 0
## 173         NA 52.96927      lin       23       0       mean absolute 0
## 254         NA 48.13238      lin       23       0       mean absolute 0
## 211         NA 58.57218      lin       23       0       mean absolute 0
##       ModWeight sample.size Reference.day Reference.month Randomised Repeat
## 122 0.014579566          80             1               7        yes      1
## 7   0.006029902          80             1               7        yes      2
## 30  0.006243374          80             1               7        yes      3
## 173 0.004734644          80             1               7        yes      4
## 254 0.013604576          80             1               7        yes      5
## 211 0.011178410          80             1               7        yes      6
##     WeightDist
## 122  0.8966667
## 7    0.9433333
## 30   0.8700000
## 173  0.9233333
## 254  0.9166667
## 211  0.8466667
# Plot residuals against fitted values to check dependency function
# (linear, quadratic, cubic, etc.)
p_res_fit <- list()

for (i in 1:(length(RaWin)-1)) {
  bestmod <- RaWin[[i]]$BestModel

  # Create residuals vs fitted plot
  p_res_fit[[i]] <- ggplot(bestmod, aes(x = .fitted, y = .resid)) +
    geom_point(alpha = 0.5) +
    geom_hline(yintercept = 0, linetype = 2) +
    geom_smooth() +
    labs(title = "Residuals vs. Fitted Plot for the best model candidate",
         x = "Fitted Values",
         y = "Residuals")
  
}

for (i in 1:length(p_res_fit)) {
  plot(p_res_fit[[i]])
}
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'

## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'

# Estimate how likely our observed result would be at random
# load(file = paste0(modeldir, "RaRand.Rdata"))

pvalues <- list()

for (i in 1:(length(RaRand)-1)) {
  pvalues[[i]] <- pvalue(dataset = RaWin[[i]]$Dataset,
                                   datasetrand = RaRand[[i]],
                                   metric = metric,
                                   sample.size = sample.size)
  
  print(pvalues[[i]])
}
## [1] 0.01
## [1] 0.015
# Plot all diagnostic plots for a given parameter combination

plotalls <- list()

for (i in 1:(length(RaWin)-1)) {
  RaOutput <- RaWin[[i]]$Dataset
  RaRand_data <- RaRand[[i]]
  WindowOpen <- RaWin[[i]]$Dataset[1, 2]
  WindowClose <- RaWin[[i]]$Dataset[1, 3]
  
  
  # Fit single best model
  RaSingle <- singlewin(xvar = xvar[i],
                        cdate = cdate,
                        bdate = bdate,
                        baseline = baseline,
                        cinterval = cinterval,
                        range = c(WindowOpen, WindowClose),
                        type = type, refday = refday,
                        stat = stat,
                        func = func,
                        spatial = list(df$place_id, daily$place_id))
  
  png(paste0(figuredir, "climwin_Ra_", RaWin$combos$climate[i], ".png"), width = 32 , height = 22, 
      units = "cm", res = 300)
  plotall(dataset = RaOutput,
          datasetrand = RaRand_data,
          bestmodel = RaSingle$BestModel,
          bestmodeldata = RaSingle$BestModelData)
  dev.off()
}
## Warning in pvalue(datasetrand = datasetrand, dataset = dataset, metric = "C", :
## Pc will be overly conservative when sample size is greater than 47
## Warning: The dot-dot notation (`..density..`) was deprecated in ggplot2 3.4.0.
## ℹ Please use `after_stat(density)` instead.
## ℹ The deprecated feature was likely used in the climwin package.
##   Please report the issue at <https://github.com/LiamDBailey/climwin/issues>.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
## Warning in pvalue(datasetrand = datasetrand, dataset = dataset, metric = "C", :
## Pc will be overly conservative when sample size is greater than 47

Weight

# Null model with no climate signal
baseline <- lm(W ~ 1, data = df)

Candidate model fitting

# # Fit candidate model set
# WWin <- slidingwin(xvar = xvar,
#                     cdate = daily$date,
#                     bdate = df$date,
#                     baseline = baseline,
#                     cinterval = cinterval,
#                     range = range,
#                     type = type, refday = refday,
#                     stat = stat,
#                     func = func,
#                     spatial = list(df$place_id, daily$place_id))
# 
# save(WWin, file = paste0(modeldir, "WWin.Rdata"))
# rm(WWin)
# gc()
# 
# # Fit randomized model set for evaluation purposes
# WRand <- randwin(repeats = repeats,
#                   xvar = xvar,
#                   cdate = cdate,
#                   bdate = bdate,
#                   baseline = baseline,
#                   cinterval = cinterval,
#                   range = range,
#                   type = type, refday = refday,
#                   stat = stat,
#                   func = func,
#                   spatial = list(df$place_id, daily$place_id))
# 
# save(WRand, file = paste0(modeldir, "WRand.Rdata"))
# rm(WRand)
# gc()

Model diagnostics

load(file = paste0(modeldir, "WWin.Rdata"))
# Possible combinations of model parameters
WWin$combos
##   response climate     type stat func DeltaAICc WindowOpen WindowClose
## 1        W   Tmean absolute mean  lin   -135.21         12           1
## 2        W    Tmin absolute mean  lin   -134.73         11           0
candidate_models <- list()

for (i in 1:(length(WWin)-1)) {
  candidate_models[[i]] <- head(WWin[[i]]$Dataset)
}

candidate_models
## [[1]]
##    deltaAICc WindowOpen WindowClose ModelBeta Std.Error ModelBetaQ ModelBetaC
## 90 -135.2087         12           1 -2.353106 0.1917869         NA         NA
## 89 -130.5201         12           2 -2.173549 0.1805567         NA         NA
## 76 -127.3969         11           2 -1.589634 0.1337819         NA         NA
## 77 -127.2708         11           1 -1.691312 0.1424147         NA         NA
## 75 -114.3714         11           3 -1.334596 0.1189802         NA         NA
## 70 -113.6503         11           8 -1.832388 0.1639086         NA         NA
##    ModelInt Function Furthest Closest Statistics     Type K    ModWeight
## 90 59.45743      lin       23       0       mean absolute 0 8.810454e-01
## 89 55.58989      lin       23       0       mean absolute 0 8.450666e-02
## 76 48.07177      lin       23       0       mean absolute 0 1.772941e-02
## 77 50.49553      lin       23       0       mean absolute 0 1.664629e-02
## 75 43.36865      lin       23       0       mean absolute 0 2.631709e-05
## 70 43.45472      lin       23       0       mean absolute 0 1.835081e-05
##    sample.size Reference.day Reference.month Randomised
## 90          80             1               7         no
## 89          80             1               7         no
## 76          80             1               7         no
## 77          80             1               7         no
## 75          80             1               7         no
## 70          80             1               7         no
## 
## [[2]]
##    deltaAICc WindowOpen WindowClose ModelBeta Std.Error ModelBetaQ ModelBetaC
## 78 -134.7341         11           0 -1.750979 0.1429826         NA         NA
## 27 -133.4188          6           1 -1.358428 0.1115165         NA         NA
## 35 -132.7023          7           1 -1.487545 0.1224713         NA         NA
## 77 -130.7553         11           1 -1.377617 0.1143278         NA         NA
## 34 -130.4450          7           2 -1.323155 0.1099486         NA         NA
## 65 -129.7608         10           1 -1.561340 0.1301085         NA         NA
##    ModelInt Function Furthest Closest Statistics     Type K  ModWeight
## 78 42.39784      lin       23       0       mean absolute 0 0.43759003
## 27 42.13035      lin       23       0       mean absolute 0 0.22669710
## 35 42.99957      lin       23       0       mean absolute 0 0.15843804
## 77 37.74706      lin       23       0       mean absolute 0 0.05985121
## 34 40.41534      lin       23       0       mean absolute 0 0.05125090
## 65 40.74113      lin       23       0       mean absolute 0 0.03640276
##    sample.size Reference.day Reference.month Randomised
## 78          80             1               7         no
## 27          80             1               7         no
## 35          80             1               7         no
## 77          80             1               7         no
## 34          80             1               7         no
## 65          80             1               7         no
load(file = paste0(modeldir, "WRand.Rdata"))
randomized_models <- list()

for (i in 1:(length(WRand)-1)) {
  randomized_models[[i]] <- head(WRand[[i]])
}

randomized_models
## [[1]]
##      deltaAICc WindowOpen WindowClose  ModelBeta Std.Error ModelBetaQ
## 278  1.7392802         23          22  0.7465549 1.4196482         NA
## 1   -1.5723915          0           0  0.8030399 0.4239522         NA
## 237 -6.6168489         21          16  3.7672412 1.2801015         NA
## 137 -0.9839297         16          16 -1.9830729 1.1452334         NA
## 277 -1.8585084         23          23 -2.4535291 1.2464429         NA
## 279  0.4534431         23          21 -1.7686276 1.4158190         NA
##     ModelBetaC  ModelInt Function Furthest Closest Statistics     Type K
## 278         NA 24.596319      lin       23       0       mean absolute 0
## 1           NA  7.754403      lin       23       0       mean absolute 0
## 237         NA 19.715414      lin       23       0       mean absolute 0
## 137         NA 35.073743      lin       23       0       mean absolute 0
## 277         NA 23.958224      lin       23       0       mean absolute 0
## 279         NA 20.572922      lin       23       0       mean absolute 0
##       ModWeight sample.size Reference.day Reference.month Randomised Repeat
## 278 0.003538857          80             1               7        yes      1
## 1   0.004939767          80             1               7        yes      2
## 237 0.026472945          80             1               7        yes      3
## 137 0.006690874          80             1               7        yes      4
## 277 0.019390154          80             1               7        yes      5
## 279 0.006391340          80             1               7        yes      6
##     WeightDist
## 278  0.9466667
## 1    0.8900000
## 237  0.8600000
## 137  0.9266667
## 277  0.9400000
## 279  0.9433333
## 
## [[2]]
##      deltaAICc WindowOpen WindowClose  ModelBeta Std.Error ModelBetaQ
## 11  -8.7102921          4           4  3.0761509 0.9370696         NA
## 3    1.5598889          1           0 -0.5854590 0.8674344         NA
## 4   -0.0995333          2           2  2.0205142 1.3898830         NA
## 94  -0.5767904         13          11 -2.4589742 1.5276913         NA
## 56  -0.2435277         10          10  0.7737837 0.5150140         NA
## 137 -2.6817542         16          16 -0.9930479 0.4580336         NA
##     ModelBetaC  ModelInt Function Furthest Closest Statistics     Type K
## 11          NA -7.786423      lin       23       0       mean absolute 0
## 3           NA 32.196954      lin       23       0       mean absolute 0
## 4           NA -5.291509      lin       23       0       mean absolute 0
## 94          NA 41.543646      lin       23       0       mean absolute 0
## 56          NA 18.188937      lin       23       0       mean absolute 0
## 137         NA 26.439505      lin       23       0       mean absolute 0
##       ModWeight sample.size Reference.day Reference.month Randomised Repeat
## 11  0.026573464          80             1               7        yes      1
## 3   0.003944777          80             1               7        yes      2
## 4   0.006015264          80             1               7        yes      3
## 94  0.006114845          80             1               7        yes      4
## 56  0.007431424          80             1               7        yes      5
## 137 0.010698327          80             1               7        yes      6
##     WeightDist
## 11   0.5500000
## 3    0.9466667
## 4    0.9200000
## 94   0.9166667
## 56   0.9300000
## 137  0.8433333
# Plot residuals against fitted values to check dependency function
# (linear, quadratic, cubic, etc.)
p_res_fit <- list()

for (i in 1:(length(WWin)-1)) {
  bestmod <- WWin[[i]]$BestModel

  # Create residuals vs fitted plot
  p_res_fit[[i]] <- ggplot(bestmod, aes(x = .fitted, y = .resid)) +
    geom_point(alpha = 0.5) +
    geom_hline(yintercept = 0, linetype = 2) +
    geom_smooth() +
    labs(title = "Residuals vs. Fitted Plot for the best model candidate",
         x = "Fitted Values",
         y = "Residuals")
  
}

for (i in 1:length(p_res_fit)) {
  plot(p_res_fit[[i]])
}
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'

## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'

# Estimate how likely our observed result would be at random
# load(file = paste0(modeldir, "WRand.Rdata"))

pvalues <- list()

for (i in 1:(length(WRand)-1)) {
  pvalues[[i]] <- pvalue(dataset = WWin[[i]]$Dataset,
                                   datasetrand = WRand[[i]],
                                   metric = metric,
                                   sample.size = sample.size)
  
  print(pvalues[[i]])
}
## [1] "<0.001"
## [1] "<0.001"
# Plot all diagnostic plots for a given parameter combination
plotalls <- list()

for (i in 1:(length(WWin)-1)) {
  WOutput <- WWin[[i]]$Dataset
  WRand_data <- WRand[[i]]
  WindowOpen <- WWin[[i]]$Dataset[1, 2]
  WindowClose <- WWin[[i]]$Dataset[1, 3]
  
  
  # Fit single best model
  WSingle <- singlewin(xvar = xvar[i],
                        cdate = cdate,
                        bdate = bdate,
                        baseline = baseline,
                        cinterval = cinterval,
                        range = c(WindowOpen, WindowClose),
                        type = type, refday = refday,
                        stat = stat,
                        func = func,
                        spatial = list(df$place_id, daily$place_id))
  
  png(paste0(figuredir, "climwin_W_", WWin$combos$climate[i], ".png"), width = 32 , height = 22, 
      units = "cm", res = 300)
  plotall(dataset = WOutput,
          datasetrand = WRand_data,
          bestmodel = WSingle$BestModel,
          bestmodeldata = WSingle$BestModelData)
  dev.off()
}
## Warning in pvalue(datasetrand = datasetrand, dataset = dataset, metric = "C", :
## Pc will be overly conservative when sample size is greater than 47

## Warning in pvalue(datasetrand = datasetrand, dataset = dataset, metric = "C", :
## Pc will be overly conservative when sample size is greater than 47

BMI

# Null model with no climate signal
baseline <- lm(BMI ~ 1, data = df)

Candidate model fitting

# Fit candidate model set
# BMIWin <- slidingwin(xvar = xvar,
#                     cdate = daily$date,
#                     bdate = df$date,
#                     baseline = baseline,
#                     cinterval = cinterval,
#                     range = range,
#                     type = type, refday = refday,
#                     stat = stat,
#                     func = func,
#                     spatial = list(df$place_id, daily$place_id))
# 
# save(BMIWin, file = paste0(modeldir, "BMIWin.Rdata"))
# rm(BMIWin)
# gc()
# 
# # Fit randomized model set for evaluation purposes
# BMIRand <- randwin(repeats = repeats,
#                   xvar = xvar,
#                   cdate = cdate,
#                   bdate = bdate,
#                   baseline = baseline,
#                   cinterval = cinterval,
#                   range = range,
#                   type = type, refday = refday,
#                   stat = stat,
#                   func = func,
#                   spatial = list(df$place_id, daily$place_id))
# 
# save(BMIRand, file = paste0(modeldir, "BMIRand.Rdata"))
# rm(BMIRand)
# gc()

Model diagnostics

load(file = paste0(modeldir, "BMIWin.Rdata"))
# Possible combinations of model parameters
BMIWin$combos
##   response climate     type stat func DeltaAICc WindowOpen WindowClose
## 1      BMI   Tmean absolute mean  lin   -174.43         11           1
## 2      BMI    Tmin absolute mean  lin   -178.05          7           1
candidate_models <- list()

for (i in 1:(length(BMIWin)-1)) {
  candidate_models[[i]] <- head(BMIWin[[i]]$Dataset)
}

candidate_models
## [[1]]
##    deltaAICc WindowOpen WindowClose   ModelBeta   Std.Error ModelBetaQ
## 77 -174.4295         11           1 -0.07766813 0.005505513         NA
## 90 -172.3767         12           1 -0.10462041 0.007464997         NA
## 91 -160.4069         12           0 -0.12467036 0.009256844         NA
## 35 -158.1235          7           1 -0.06120473 0.004580478         NA
## 78 -156.3795         11           0 -0.08721827 0.006567189         NA
## 27 -155.1897          6           1 -0.05299700 0.004007224         NA
##    ModelBetaC ModelInt Function Furthest Closest Statistics     Type K
## 77         NA 1.231467      lin       23       0       mean absolute 0
## 90         NA 1.590732      lin       23       0       mean absolute 0
## 91         NA 1.951213      lin       23       0       mean absolute 0
## 35         NA 1.147511      lin       23       0       mean absolute 0
## 78         NA 1.420318      lin       23       0       mean absolute 0
## 27         NA 1.028236      lin       23       0       mean absolute 0
##       ModWeight sample.size Reference.day Reference.month Randomised
## 77 7.354500e-01          80             1               7         no
## 90 2.635080e-01          80             1               7         no
## 91 6.631108e-04          80             1               7         no
## 35 2.117104e-04          80             1               7         no
## 78 8.851959e-05          80             1               7         no
## 27 4.882919e-05          80             1               7         no
## 
## [[2]]
##    deltaAICc WindowOpen WindowClose   ModelBeta   Std.Error ModelBetaQ
## 35 -178.0494          7           1 -0.06764405 0.004740403         NA
## 27 -177.0630          6           1 -0.06147451 0.004321416         NA
## 54 -165.0565          9           1 -0.06570544 0.004802372         NA
## 36 -164.4469          7           0 -0.08332423 0.006102579         NA
## 28 -164.2161          6           0 -0.07837909 0.005744858         NA
## 66 -161.4739         10           0 -0.08813650 0.006520326         NA
##    ModelBetaC  ModelInt Function Furthest Closest Statistics     Type K
## 35         NA 0.8785793      lin       23       0       mean absolute 0
## 27         NA 0.8350032      lin       23       0       mean absolute 0
## 54         NA 0.7678112      lin       23       0       mean absolute 0
## 36         NA 1.1131405      lin       23       0       mean absolute 0
## 28         NA 1.0911800      lin       23       0       mean absolute 0
## 66         NA 1.0034297      lin       23       0       mean absolute 0
##       ModWeight sample.size Reference.day Reference.month Randomised
## 35 0.6193364469          80             1               7         no
## 27 0.3782085551          80             1               7         no
## 54 0.0009344266          80             1               7         no
## 36 0.0006889429          80             1               7         no
## 28 0.0006138386          80             1               7         no
## 66 0.0001558139          80             1               7         no
load(file = paste0(modeldir, "BMIRand.Rdata"))
randomized_models <- list()

for (i in 1:(length(BMIRand)-1)) {
  randomized_models[[i]] <- head(BMIRand[[i]])
}

randomized_models
## [[1]]
##       deltaAICc WindowOpen WindowClose   ModelBeta  Std.Error ModelBetaQ
## 278   1.7856261         23          22  0.02720195 0.05668212         NA
## 191   1.4217199         19          19  0.01793895 0.02328707         NA
## 277  -1.1081828         23          23  0.08798754 0.04979054         NA
## 2771 -0.5028966         23          23 -0.07902360 0.04981116         NA
## 29    1.6411011          7           7  0.01903874 0.03111101         NA
## 211   1.6483584         20          20  0.02963739 0.04890527         NA
##      ModelBetaC    ModelInt Function Furthest Closest Statistics     Type K
## 278          NA  0.03342645      lin       23       0       mean absolute 0
## 191          NA -0.01111284      lin       23       0       mean absolute 0
## 277          NA -0.01001583      lin       23       0       mean absolute 0
## 2771         NA  0.00899544      lin       23       0       mean absolute 0
## 29           NA -0.28434623      lin       23       0       mean absolute 0
## 211          NA  0.19839141      lin       23       0       mean absolute 0
##        ModWeight sample.size Reference.day Reference.month Randomised Repeat
## 278  0.003652545          80             1               7        yes      1
## 191  0.003972667          80             1               7        yes      2
## 277  0.012082502          80             1               7        yes      3
## 2771 0.010364197          80             1               7        yes      4
## 29   0.003683195          80             1               7        yes      5
## 211  0.003798873          80             1               7        yes      6
##      WeightDist
## 278   0.9466667
## 191   0.9433333
## 277   0.9333333
## 2771  0.9400000
## 29    0.9433333
## 211   0.9466667
## 
## [[2]]
##      deltaAICc WindowOpen WindowClose   ModelBeta  Std.Error ModelBetaQ
## 37   1.6326127          8           8 -0.02896578 0.04680636         NA
## 47  -0.4875348          9           8  0.06995276 0.04422879         NA
## 193  1.5256214         19          17 -0.02127662 0.03040358         NA
## 277  0.3335591         23          23  0.02560200 0.01975061         NA
## 172  0.5516240         18          18  0.04029047 0.03331770         NA
## 13  -5.0087298          4           2  0.15074927 0.05681653         NA
##     ModelBetaC     ModelInt Function Furthest Closest Statistics     Type K
## 37          NA  0.138121425      lin       23       0       mean absolute 0
## 47          NA -0.424075715      lin       23       0       mean absolute 0
## 193         NA  0.006623042      lin       23       0       mean absolute 0
## 277         NA  0.057488317      lin       23       0       mean absolute 0
## 172         NA -0.080982509      lin       23       0       mean absolute 0
## 13          NA -1.966914276      lin       23       0       mean absolute 0
##       ModWeight sample.size Reference.day Reference.month Randomised Repeat
## 37  0.003759648          80             1               7        yes      1
## 47  0.004799659          80             1               7        yes      2
## 193 0.003834957          80             1               7        yes      3
## 277 0.005626720          80             1               7        yes      4
## 172 0.004763455          80             1               7        yes      5
## 13  0.010336698          80             1               7        yes      6
##     WeightDist
## 37   0.9433333
## 47   0.8933333
## 193  0.9433333
## 277  0.9300000
## 172  0.9266667
## 13   0.7966667
# Plot residuals against fitted values to check dependency function
# (linear, quadratic, cubic, etc.)
p_res_fit <- list()

for (i in 1:(length(BMIWin)-1)) {
  bestmod <- BMIWin[[i]]$BestModel

  # Create residuals vs fitted plot
  p_res_fit[[i]] <- ggplot(bestmod, aes(x = .fitted, y = .resid)) +
    geom_point(alpha = 0.5) +
    geom_hline(yintercept = 0, linetype = 2) +
    geom_smooth() +
    labs(title = "Residuals vs. Fitted Plot for the best model candidate",
         x = "Fitted Values",
         y = "Residuals")
  
}

for (i in 1:length(p_res_fit)) {
  plot(p_res_fit[[i]])
}
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'

## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'

# Estimate how likely our observed result would be at random
# load(file = paste0(modeldir, "BMIRand.Rdata"))

pvalues <- list()

for (i in 1:(length(BMIRand)-1)) {
  pvalues[[i]] <- pvalue(dataset = BMIWin[[i]]$Dataset,
                                   datasetrand = BMIRand[[i]],
                                   metric = metric,
                                   sample.size = sample.size)
  
  print(pvalues[[i]])
}
## [1] "<0.001"
## [1] "<0.001"
# Plot all diagnostic plots for a given parameter combination
plotalls <- list()

for (i in 1:(length(BMIWin)-1)) {
  BMIOutput <- BMIWin[[i]]$Dataset
  BMIRand_data <- BMIRand[[i]]
  WindowOpen <- BMIWin[[i]]$Dataset[1, 2]
  WindowClose <- BMIWin[[i]]$Dataset[1, 3]
  
  
  # Fit single best model
  BMISingle <- singlewin(xvar = xvar[i],
                        cdate = cdate,
                        bdate = bdate,
                        baseline = baseline,
                        cinterval = cinterval,
                        range = c(WindowOpen, WindowClose),
                        type = type, refday = refday,
                        stat = stat,
                        func = func,
                        spatial = list(df$place_id, daily$place_id))
  
  png(paste0(figuredir, "climwin_BMI_", BMIWin$combos$climate[i], ".png"), width = 32 , height = 22, 
      units = "cm", res = 300)
  plotall(dataset = BMIOutput,
          datasetrand = BMIRand_data,
          bestmodel = BMISingle$BestModel,
          bestmodeldata = BMISingle$BestModelData)
  dev.off()
}
## Warning in pvalue(datasetrand = datasetrand, dataset = dataset, metric = "C", :
## Pc will be overly conservative when sample size is greater than 47

## Warning in pvalue(datasetrand = datasetrand, dataset = dataset, metric = "C", :
## Pc will be overly conservative when sample size is greater than 47

Precipitation

Currently only average daily precipitation (Prcp) available.

# Set priors
# List of dependent variable candidates
xvar <- list(Prcp = daily$Prcp)

stat <- "sum"

Ra

# Null model with no climate signal
baseline <- lm(Ra ~ 1, data = df)

Candidate model fitting

# # Fit candidate model set
# RaWin <- slidingwin(xvar = xvar,
#                     cdate = daily$date,
#                     bdate = df$date,
#                     baseline = baseline,
#                     cinterval = cinterval,
#                     range = range,
#                     type = type, refday = refday,
#                     stat = stat,
#                     func = func,
#                     spatial = list(df$place_id, daily$place_id))
# 
# save(RaWin, file = paste0(modeldir, "RaWin_Prcp.Rdata"))
# rm(RaWin)
# gc()
# 
# # Fit randomized model set for evaluation purposes
# RaRand <- randwin(repeats = repeats,
#                   xvar = xvar,
#                   cdate = cdate,
#                   bdate = bdate,
#                   baseline = baseline,
#                   cinterval = cinterval,
#                   range = range,
#                   type = type, refday = refday,
#                   stat = stat,
#                   func = func,
#                   spatial = list(df$place_id, daily$place_id))
# 
# save(RaRand, file = paste0(modeldir, "RaRand_Prcp.Rdata"))
# rm(RaRand)
# gc()

Model diagnostics

load(file = paste0(modeldir, "RaWin_Prcp.Rdata"))
# Possible combinations of model parameters
RaWin$combos
##   response climate     type stat func DeltaAICc WindowOpen WindowClose
## 1       Ra    Prcp absolute  sum  lin     -9.48         23           0
candidate_models <- list()

for (i in 1:(length(RaWin)-1)) {
  candidate_models[[i]] <- head(RaWin[[i]]$Dataset)
}

candidate_models
## [[1]]
##     deltaAICc WindowOpen WindowClose   ModelBeta  Std.Error ModelBetaQ
## 300 -9.479733         23           0  0.07261819 0.02136251         NA
## 11  -9.375061          4           4 -0.16210635 0.04790806         NA
## 14  -8.885283          4           1 -0.10577676 0.03196058         NA
## 2   -8.564872          1           1 -0.16930650 0.05193061         NA
## 20  -7.766115          5           1 -0.10801827 0.03446747         NA
## 276 -7.208573         22           0  0.06600546 0.02169290         NA
##     ModelBetaC ModelInt Function Furthest Closest Statistics     Type K
## 300         NA 51.89125      lin       23       0        sum absolute 0
## 11          NA 54.20336      lin       23       0        sum absolute 0
## 14          NA 54.43722      lin       23       0        sum absolute 0
## 2           NA 54.23251      lin       23       0        sum absolute 0
## 20          NA 54.60563      lin       23       0        sum absolute 0
## 276         NA 52.24956      lin       23       0        sum absolute 0
##      ModWeight sample.size Reference.day Reference.month Randomised
## 300 0.11063531          80             1               7         no
## 11  0.10499403          80             1               7         no
## 14  0.08218841          80             1               7         no
## 2   0.07002197          80             1               7         no
## 20  0.04696630          80             1               7         no
## 276 0.03554002          80             1               7         no
load(file = paste0(modeldir, "RaRand_Prcp.Rdata"))
randomized_models <- list()

for (i in 1:(length(RaRand)-1)) {
  randomized_models[[i]] <- head(RaRand[[i]])
}

randomized_models
## [[1]]
##       deltaAICc WindowOpen WindowClose  ModelBeta Std.Error ModelBetaQ
## 82   -0.5282470         12           9 -0.5664693 0.3552781         NA
## 105  -2.3651957         13           0  0.4811871 0.2298451         NA
## 24   -0.3677721          6           4  3.9568309 2.5639388         NA
## 58   -2.1946702         10           8  1.0582498 0.5156493         NA
## 232   0.0000000         21          21         NA        NA         NA
## 2321  0.0000000         21          21         NA        NA         NA
##      ModelBetaC ModelInt Function Furthest Closest Statistics     Type K
## 82           NA 59.72266      lin       23       0        sum absolute 0
## 105          NA 43.77419      lin       23       0        sum absolute 0
## 24           NA 46.95370      lin       23       0        sum absolute 0
## 58           NA 44.73846      lin       23       0        sum absolute 0
## 232          NA 54.09590      lin       23       0        sum absolute 0
## 2321         NA 54.09590      lin       23       0        sum absolute 0
##        ModWeight sample.size Reference.day Reference.month Randomised Repeat
## 82   0.008196999          80             1               7        yes      1
## 105  0.012706604          80             1               7        yes      2
## 24   0.010422264          80             1               7        yes      3
## 58   0.012919551          80             1               7        yes      4
## 232  0.008365568          80             1               7        yes      5
## 2321 0.006893276          80             1               7        yes      6
##      WeightDist
## 82    0.9266667
## 105   0.8800000
## 24    0.9466667
## 58    0.8933333
## 232   0.9433333
## 2321  0.9333333
# Plot residuals against fitted values to check dependency function
# (linear, quadratic, cubic, etc.)
p_res_fit <- list()

for (i in 1:(length(RaWin)-1)) {
  bestmod <- RaWin[[i]]$BestModel

  # Create residuals vs fitted plot
  p_res_fit[[i]] <- ggplot(bestmod, aes(x = .fitted, y = .resid)) +
    geom_point(alpha = 0.5) +
    geom_hline(yintercept = 0, linetype = 2) +
    geom_smooth() +
    labs(title = "Residuals vs. Fitted Plot for the best model candidate",
         x = "Fitted Values",
         y = "Residuals")
  
}

for (i in 1:length(p_res_fit)) {
  plot(p_res_fit[[i]])
}
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'

# Estimate how likely our observed result would be at random
# load(file = paste0(modeldir, "RaRand_Prcp.Rdata"))

pvalues <- list()

for (i in 1:(length(RaRand)-1)) {
  pvalues[[i]] <- pvalue(dataset = RaWin[[i]]$Dataset,
                                   datasetrand = RaRand[[i]],
                                   metric = metric,
                                   sample.size = sample.size)
  
  print(pvalues[[i]])
}
## [1] 0.01
# Plot all diagnostic plots for a given parameter combination

plotalls <- list()

for (i in 1:(length(RaWin)-1)) {
  RaOutput <- RaWin[[i]]$Dataset
  RaRand_data <- RaRand[[i]]
  WindowOpen <- RaWin[[i]]$Dataset[1, 2]
  WindowClose <- RaWin[[i]]$Dataset[1, 3]
  
  
  # Fit single best model
  RaSingle <- singlewin(xvar = xvar[i],
                        cdate = cdate,
                        bdate = bdate,
                        baseline = baseline,
                        cinterval = cinterval,
                        range = c(WindowOpen, WindowClose),
                        type = type, refday = refday,
                        stat = stat,
                        func = func,
                        spatial = list(df$place_id, daily$place_id))
  
  png(paste0(figuredir, "climwin_Ra_", RaWin$combos$climate[i], ".png"), width = 32 , height = 22, 
      units = "cm", res = 300)
  plotall(dataset = RaOutput,
          datasetrand = RaRand_data,
          bestmodel = RaSingle$BestModel,
          bestmodeldata = RaSingle$BestModelData)
  dev.off()
}
## Warning in pvalue(datasetrand = datasetrand, dataset = dataset, metric = "C", :
## Pc will be overly conservative when sample size is greater than 47

W

# Null model with no climate signal
baseline <- lm(W ~ 1, data = df)

Candidate model fitting

# # Fit candidate model set
# WWin <- slidingwin(xvar = xvar,
#                     cdate = daily$date,
#                     bdate = df$date,
#                     baseline = baseline,
#                     cinterval = cinterval,
#                     range = range,
#                     type = type, refday = refday,
#                     stat = stat,
#                     func = func,
#                     spatial = list(df$place_id, daily$place_id))
# 
# save(WWin, file = paste0(modeldir, "WWin_Prcp.Rdata"))
# rm(WWin)
# gc()
# 
# # Fit randomized model set for evaluation purposes
# WRand <- randwin(repeats = repeats,
#                   xvar = xvar,
#                   cdate = cdate,
#                   bdate = bdate,
#                   baseline = baseline,
#                   cinterval = cinterval,
#                   range = range,
#                   type = type, refday = refday,
#                   stat = stat,
#                   func = func,
#                   spatial = list(df$place_id, daily$place_id))
# 
# save(WRand, file = paste0(modeldir, "WRand_Prcp.Rdata"))
# rm(WRand)
# gc()

Model diagnostics

load(file = paste0(modeldir, "WWin_Prcp.Rdata"))
# Possible combinations of model parameters
WWin$combos
##   response climate     type stat func DeltaAICc WindowOpen WindowClose
## 1        W    Prcp absolute  sum  lin   -123.06         22          22
candidate_models <- list()

for (i in 1:(length(WWin)-1)) {
  candidate_models[[i]] <- head(WWin[[i]]$Dataset)
}

candidate_models
## [[1]]
##      deltaAICc WindowOpen WindowClose  ModelBeta  Std.Error ModelBetaQ
## 254 -123.06368         22          22 -1.3253373 0.11362773         NA
## 19  -109.58112          5           2 -1.0983371 0.10016521         NA
## 54  -104.92159          9           1 -0.9966768 0.09300497         NA
## 44  -103.27344          8           1 -0.7751790 0.07294194         NA
## 124  -92.58077         15          12  0.8068519 0.08039843         NA
## 142  -91.84189         16          11  0.7625155 0.07629873         NA
##     ModelBetaC ModelInt Function Furthest Closest Statistics     Type K
## 254         NA 24.67660      lin       23       0        sum absolute 0
## 19          NA 27.97575      lin       23       0        sum absolute 0
## 54          NA 34.19336      lin       23       0        sum absolute 0
## 44          NA 31.32547      lin       23       0        sum absolute 0
## 124         NA 20.74609      lin       23       0        sum absolute 0
## 142         NA 18.36155      lin       23       0        sum absolute 0
##        ModWeight sample.size Reference.day Reference.month Randomised
## 254 9.986548e-01          80             1               7         no
## 19  1.179541e-03          80             1               7         no
## 54  1.147913e-04          80             1               7         no
## 44  5.035219e-05          80             1               7         no
## 124 2.399576e-07          80             1               7         no
## 142 1.658396e-07          80             1               7         no
load(file = paste0(modeldir, "WRand_Prcp.Rdata"))
randomized_models <- list()

for (i in 1:(length(WRand)-1)) {
  randomized_models[[i]] <- head(WRand[[i]])
}

randomized_models
## [[1]]
##       deltaAICc WindowOpen WindowClose ModelBeta Std.Error ModelBetaQ
## 232   0.0000000         21          21        NA        NA         NA
## 278  -2.6691459         23          22  2.826055  1.305248         NA
## 2321  0.0000000         21          21        NA        NA         NA
## 21   -0.6216693          5           0 -2.916436  1.796394         NA
## 2322  0.0000000         21          21        NA        NA         NA
## 2323  0.0000000         21          21        NA        NA         NA
##      ModelBetaC ModelInt Function Furthest Closest Statistics     Type K
## 232          NA 23.67893      lin       23       0        sum absolute 0
## 278          NA 13.46661      lin       23       0        sum absolute 0
## 2321         NA 23.67893      lin       23       0        sum absolute 0
## 21           NA 41.13023      lin       23       0        sum absolute 0
## 2322         NA 23.67893      lin       23       0        sum absolute 0
## 2323         NA 23.67893      lin       23       0        sum absolute 0
##        ModWeight sample.size Reference.day Reference.month Randomised Repeat
## 232  0.008746884          80             1               7        yes      1
## 278  0.008070136          80             1               7        yes      2
## 2321 0.008848540          80             1               7        yes      3
## 21   0.006794485          80             1               7        yes      4
## 2322 0.007440463          80             1               7        yes      5
## 2323 0.007443843          80             1               7        yes      6
##      WeightDist
## 232   0.9466667
## 278   0.8266667
## 2321  0.9466667
## 21    0.9100000
## 2322  0.9366667
## 2323  0.9366667
# Plot residuals against fitted values to check dependency function
# (linear, quadratic, cubic, etc.)
p_res_fit <- list()

for (i in 1:(length(WWin)-1)) {
  bestmod <- WWin[[i]]$BestModel

  # Create residuals vs fitted plot
  p_res_fit[[i]] <- ggplot(bestmod, aes(x = .fitted, y = .resid)) +
    geom_point(alpha = 0.5) +
    geom_hline(yintercept = 0, linetype = 2) +
    geom_smooth() +
    labs(title = "Residuals vs. Fitted Plot for the best model candidate",
         x = "Fitted Values",
         y = "Residuals")
  
}

for (i in 1:length(p_res_fit)) {
  plot(p_res_fit[[i]])
}
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'

# Estimate how likely our observed result would be at random
# load(file = paste0(modeldir, "WRand_Prcp.Rdata"))

pvalues <- list()

for (i in 1:(length(WRand)-1)) {
  pvalues[[i]] <- pvalue(dataset = WWin[[i]]$Dataset,
                                   datasetrand = WRand[[i]],
                                   metric = metric,
                                   sample.size = sample.size)
  
  print(pvalues[[i]])
}
## [1] "<0.001"
# Plot all diagnostic plots for a given parameter combination

plotalls <- list()

for (i in 1:(length(WWin)-1)) {
  WOutput <- WWin[[i]]$Dataset
  WRand_data <- WRand[[i]]
  WindowOpen <- WWin[[i]]$Dataset[1, 2]
  WindowClose <- WWin[[i]]$Dataset[1, 3]
  
  
  # Fit single best model
  WSingle <- singlewin(xvar = xvar[i],
                        cdate = cdate,
                        bdate = bdate,
                        baseline = baseline,
                        cinterval = cinterval,
                        range = c(WindowOpen, WindowClose),
                        type = type, refday = refday,
                        stat = stat,
                        func = func,
                        spatial = list(df$place_id, daily$place_id))
  
  png(paste0(figuredir, "climwin_W_", WWin$combos$climate[i], ".png"), width = 32 , height = 22, 
      units = "cm", res = 300)
  plotall(dataset = WOutput,
          datasetrand = WRand_data,
          bestmodel = WSingle$BestModel,
          bestmodeldata = WSingle$BestModelData)
  dev.off()
}
## Warning in plotwin(dataset = dataset, cw = cwa): Top window has a weight
## greater than 0.95. Plotting single best window only.
## Warning in pvalue(datasetrand = datasetrand, dataset = dataset, metric = "C", :
## Pc will be overly conservative when sample size is greater than 47

BMI

# Null model with no climate signal
baseline <- lm(BMI ~ 1, data = df)

Candidate model fitting

# # Fit candidate model set
# BMIWin <- slidingwin(xvar = xvar,
#                     cdate = daily$date,
#                     bdate = df$date,
#                     baseline = baseline,
#                     cinterval = cinterval,
#                     range = range,
#                     type = type, refday = refday,
#                     stat = stat,
#                     func = func,
#                     spatial = list(df$place_id, daily$place_id))
# 
# save(BMIWin, file = paste0(modeldir, "BMIWin_Prcp.Rdata"))
# rm(BMIWin)
# gc()
# 
# # Fit randomized model set for evaluation purposes
# BMIRand <- randwin(repeats = repeats,
#                   xvar = xvar,
#                   cdate = cdate,
#                   bdate = bdate,
#                   baseline = baseline,
#                   cinterval = cinterval,
#                   range = range,
#                   type = type, refday = refday,
#                   stat = stat,
#                   func = func,
#                   spatial = list(df$place_id, daily$place_id))
# 
# save(BMIRand, file = paste0(modeldir, "BMIRand_Prcp.Rdata"))
# rm(BMIRand)
# gc()

Model diagnostics

load(file = paste0(modeldir, "BMIWin_Prcp.Rdata"))
# Possible combinations of model parameters
BMIWin$combos
##   response climate     type stat func DeltaAICc WindowOpen WindowClose
## 1      BMI    Prcp absolute  sum  lin   -170.76         22          22
candidate_models <- list()

for (i in 1:(length(BMIWin)-1)) {
  candidate_models[[i]] <- head(BMIWin[[i]]$Dataset)
}

candidate_models
## [[1]]
##     deltaAICc WindowOpen WindowClose   ModelBeta   Std.Error ModelBetaQ
## 254 -170.7631         22          22 -0.06121892 0.004391032         NA
## 44  -145.9808          8           1 -0.03617652 0.002828411         NA
## 19  -138.6339          5           2 -0.04875724 0.003920471         NA
## 142 -126.1537         16          11  0.03517142 0.002975613         NA
## 41  -123.9901          8           4 -0.02465808 0.002105587         NA
## 43  -121.5037          8           2 -0.03452878 0.002980593         NA
##     ModelBetaC    ModelInt Function Furthest Closest Statistics     Type K
## 254         NA  0.04608336      lin       23       0        sum absolute 0
## 44          NA  0.35685323      lin       23       0        sum absolute 0
## 19          NA  0.19074364      lin       23       0        sum absolute 0
## 142         NA -0.24526696      lin       23       0        sum absolute 0
## 41          NA  0.18001181      lin       23       0        sum absolute 0
## 43          NA  0.31273865      lin       23       0        sum absolute 0
##        ModWeight sample.size Reference.day Reference.month Randomised
## 254 9.999957e-01          80             1               7         no
## 44  4.155110e-06          80             1               7         no
## 19  1.054958e-07          80             1               7         no
## 142 2.056803e-10          80             1               7         no
## 41  6.972311e-11          80             1               7         no
## 43  2.011211e-11          80             1               7         no
load(file = paste0(modeldir, "BMIRand_Prcp.Rdata"))
randomized_models <- list()

for (i in 1:(length(BMIRand)-1)) {
  randomized_models[[i]] <- head(BMIRand[[i]])
}

randomized_models
## [[1]]
##      deltaAICc WindowOpen WindowClose  ModelBeta  Std.Error ModelBetaQ
## 21   -1.329392          5           0 -0.1311001 0.07168742         NA
## 232   0.000000         21          21         NA         NA         NA
## 2321  0.000000         21          21         NA         NA         NA
## 2322  0.000000         21          21         NA         NA         NA
## 217  -6.687921         20          14  0.1322217 0.04474381         NA
## 2323  0.000000         21          21         NA         NA         NA
##      ModelBetaC      ModelInt Function Furthest Closest Statistics     Type K
## 21           NA  7.844737e-01      lin       23       0        sum absolute 0
## 232          NA -9.645433e-18      lin       23       0        sum absolute 0
## 2321         NA -9.645433e-18      lin       23       0        sum absolute 0
## 2322         NA -9.645433e-18      lin       23       0        sum absolute 0
## 217          NA -8.521217e-01      lin       23       0        sum absolute 0
## 2323         NA -9.645433e-18      lin       23       0        sum absolute 0
##        ModWeight sample.size Reference.day Reference.month Randomised Repeat
## 21   0.009468525          80             1               7        yes      1
## 232  0.006276792          80             1               7        yes      2
## 2321 0.007850904          80             1               7        yes      3
## 2322 0.007588910          80             1               7        yes      4
## 217  0.051714927          80             1               7        yes      5
## 2323 0.008778716          80             1               7        yes      6
##      WeightDist
## 21    0.9066667
## 232   0.9266667
## 2321  0.9400000
## 2322  0.9400000
## 217   0.7600000
## 2323  0.9466667
# Plot residuals against fitted values to check dependency function
# (linear, quadratic, cubic, etc.)
p_res_fit <- list()

for (i in 1:(length(BMIWin)-1)) {
  bestmod <- BMIWin[[i]]$BestModel

  # Create residuals vs fitted plot
  p_res_fit[[i]] <- ggplot(bestmod, aes(x = .fitted, y = .resid)) +
    geom_point(alpha = 0.5) +
    geom_hline(yintercept = 0, linetype = 2) +
    geom_smooth() +
    labs(title = "Residuals vs. Fitted Plot for the best model candidate",
         x = "Fitted Values",
         y = "Residuals")
  
}

for (i in 1:length(p_res_fit)) {
  plot(p_res_fit[[i]])
}
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'

# Estimate how likely our observed result would be at random
# load(file = paste0(modeldir, "BMIRand_Prcp.Rdata"))

pvalues <- list()

for (i in 1:(length(BMIRand)-1)) {
  pvalues[[i]] <- pvalue(dataset = BMIWin[[i]]$Dataset,
                                   datasetrand = BMIRand[[i]],
                                   metric = metric,
                                   sample.size = sample.size)
  
  print(pvalues[[i]])
}
## [1] "<0.001"
# Plot all diagnostic plots for a given parameter combination
plotalls <- list()

for (i in 1:(length(BMIWin)-1)) {
  BMIOutput <- BMIWin[[i]]$Dataset
  BMIRand_data <- BMIRand[[i]]
  WindowOpen <- BMIWin[[i]]$Dataset[1, 2]
  WindowClose <- BMIWin[[i]]$Dataset[1, 3]
  
  
  # Fit single best model
  BMISingle <- singlewin(xvar = xvar[i],
                        cdate = cdate,
                        bdate = bdate,
                        baseline = baseline,
                        cinterval = cinterval,
                        range = c(WindowOpen, WindowClose),
                        type = type, refday = refday,
                        stat = stat,
                        func = func,
                        spatial = list(df$place_id, daily$place_id))
  
  png(paste0(figuredir, "climwin_BMI_", BMIWin$combos$climate[i], ".png"), width = 32 , height = 22, 
      units = "cm", res = 300)
  plotall(dataset = BMIOutput,
          datasetrand = BMIRand_data,
          bestmodel = BMISingle$BestModel,
          bestmodeldata = BMISingle$BestModelData)
  dev.off()
}
## Warning in plotwin(dataset = dataset, cw = cwa): Top window has a weight
## greater than 0.95. Plotting single best window only.
## Warning in pvalue(datasetrand = datasetrand, dataset = dataset, metric = "C", :
## Pc will be overly conservative when sample size is greater than 47

End of script